Reasoning about Agent Programs using ATL-like Logics 



Nitin Yadav and Sebastian Sardina * 
RMIT University, Melbourne, Australia. 



Abstract. We propose a variant of Alternating-time Temporal Logic (ATL) 
grounded in the agents' operational know-how, as defined by their libraries of 
abstract plans. Inspired by ATLES, a variant itself of ATL, it is possible in our 
logic to explicitly refer to "rational" strategies for agents developed under the 
Belief-Desire-Intention agent programming paradigm. This allows us to express 
and verify properties of BDI systems using ATL-type logical frameworks. 

Keywords: Agent Programming, Reactive plans, ATL, Model Checking. 



1 Introduction 

The formal verification of agent-oriented programs requires logic frameworks capable 
of representing and reasoning about agents' abilities and capabilities, and the goals they 
can feasibly achieve. In particular, we are interested here in programs written in the
family of Belief-Desire-Intention (BDI) agent programming systems, a popular
paradigm for building multi-agent systems. Traditional BDI logics based on CTL (e.g.,
[17]) are generally too weak for representing ability; their success has primarily been 
in defining "rationality postulates," i.e., constraints on rational behaviour. Further, such 
logics do not encode agents' capabilities (as represented by their plan libraries) and 
thereby leave a sizable gap between agent programs and their formal verification. 

Recent work (e.g., yj, y, |9[]) has better bridged the gap between formal logic and 
practical programming by providing an axiomatisation of a class of models that is de- 
signed to closely model a programming framework. However, this is done by restricting 
the logic's models to those that satisfy the transition relations of agents' plans, as de- 
fined by the semantics of the programming language itself. In such a framework, it is 
not possible to reason about the agent's know-how and what the agent could achieve if 
it had specific capabilities. It is also not possible to reason about coalitions of agents.

Our aim thus is to define a framework, together with model checking techniques, 
that will allow us to speculate about a group of agents' capabilities and what they can 
achieve with such capabilities under the BDI paradigm, which enables abstract plans 
written by programmers to be combined and used in real-time under the principles of 

This requires the ability to represent capabilities directly in our logic. To that end,
we adapt ATLES, a version of ATL (Alternating-time Temporal Logic) [3] with Explicit
Strategies [20], to our purpose. ATL is a logic for reasoning about the ability of
agent coalitions in multi-player game structures. This is achieved by reasoning about
strategies (and their success) employed by teams of agents: $\langle\langle A \rangle\rangle \varphi$ expresses that the
coalition team of agents $A$ has a joint strategy for guaranteeing that the temporal
property $\varphi$ holds. As observed by Walther et al. [20], standard ATL does not allow agents'
strategies to be explicitly represented in the syntax of the logic. They thus rectified this
shortcoming by defining ATLES, which extends ATL by allowing strategy terms in the language:
$\langle\langle A \rangle\rangle_\rho \varphi$ holds if coalition $A$ has a joint strategy for ensuring $\varphi$, when some agents are
committed to specific strategies as specified by a so-called commitment function $\rho$.

In this paper, we go further and develop a framework, called BDI-ATLES, in
which the strategy terms are tied directly to the plans available to agents under the
notion of practical reasoning embodied by the BDI paradigm [6, 18]: the only strategies
that can be employed by a BDI agent are those that ensue by the (rational) execution of
its predefined plans, given its goals and beliefs. The key construct $\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$ in the new
framework states that coalition $A$ has a joint strategy for ensuring $\varphi$, under the
assumptions that some agents in the system are BDI-style agents with capabilities and goals as
specified by assignments $\omega$ and $\varrho$, respectively. For instance, in the Gold Mining domain
from the International Agent Contest, one may want to verify if two miner agents
programmed in a BDI language can successfully collect gold pieces when equipped with
navigation and communication capabilities and want to win the game, while the
opponent agents can perform any physically legal action. More interestingly, a formula like
$\langle\langle A \rangle\rangle_{\emptyset,\emptyset} \varphi \supset \langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$ can be used to check whether coalition $A$ has enough know-how
and motivations to carry out a task $\varphi$ that is indeed physically feasible for the coalition.

We observe that the notion of "rationality" used in this work is that found in the 
literature on BDI and agent programming, rather than that common in game-theory 
(generally captured via solution concepts). As such, rationality shall refer from now on 
to reasonable constraints on how the various mental modalities — e.g., beliefs, intention, 
goals — may interact. In particular, we focus on the constraint that agents select actions 
from their know-how in order to achieve their goals in the context of their beliefs. 

Finally, we stress that this work aims to contribute to the agent-oriented programming
community more than to the (ATL) verification one. Indeed, our aim is to motivate the 
former to adopt well-established techniques in game-theory for the effective verification 
of their "reactive" style agent programs. 



2 Preliminaries 

2.1 ATL/ATLES Logics of Coalitions 

Alternating-time Temporal Logic (ATL) [3] is a logic for reasoning about the ability of
agent coalitions in multi-agent game structures. ATL formulae are built by combining
propositional formulas, the usual temporal operators, namely $\bigcirc$ ("in the next state"),
$\Box$ ("always"), $\Diamond$ ("eventually"), and $\mathcal{U}$ ("strict until"), and a coalition path quantifier
$\langle\langle A \rangle\rangle$ taking a set of agents $A$ as parameter. As in CTL, which ATL extends, temporal
operators and path quantifiers are required to alternate. Intuitively, an ATL formula
$\langle\langle A \rangle\rangle \varphi$, where $A$ is a set of agents, holds in an ATL structure if by suitably choosing
their moves, the agents in $A$ can force $\varphi$ true, no matter how other agents happen to
move. The semantics of ATL is defined in so-called concurrent game structures where,
at each point, all agents simultaneously choose their moves from a finite set, and the next
state deterministically depends on such choices. More concretely, an ATL structure is
a tuple $\mathcal{M} = \langle \mathcal{A}, Q, \mathcal{P}, Act, d, \mathcal{V}, \sigma \rangle$, where $\mathcal{A} = \{1, \ldots, k\}$ is a finite set of agents,
$Q$ is the finite set of states, $\mathcal{P}$ is the finite set of propositions, $Act$ is the set of all
domain actions, $d : \mathcal{A} \times Q \mapsto 2^{Act}$ indicates all available actions for an agent in a
state, $\mathcal{V} : Q \mapsto 2^{\mathcal{P}}$ is the valuation function stating what is true in each state, and
$\sigma : Q \times Act^{|\mathcal{A}|} \mapsto Q$ is the transition function mapping a state $q$ and a joint-move
$\vec{a} \in \mathcal{D}(q)$, where $\mathcal{D}(q) = \times_{i=1}^{|\mathcal{A}|} d(i, q)$ is the set of legal joint-moves in $q$, to the
resulting next state $q'$.
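
The tuple above can be pictured as a small data structure. The following is only an illustrative sketch, not part of the original formalisation: the class name `CGS`, its field names, and the two-agent toy instance are all hypothetical, and states, actions, and propositions are assumed to be plain strings.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, FrozenSet, List, Tuple

State = str
Action = str
JointMove = Tuple[Action, ...]


@dataclass
class CGS:
    """Hypothetical container for an ATL structure <A, Q, P, Act, d, V, sigma>."""
    agents: List[str]                                   # A (order fixes joint-move positions)
    states: FrozenSet[State]                            # Q
    avail: Callable[[str, State], FrozenSet[Action]]    # d(agt, q)
    valuation: Dict[State, FrozenSet[str]]              # V(q)
    trans: Callable[[State, JointMove], State]          # sigma(q, joint move)

    def joint_moves(self, q: State) -> List[JointMove]:
        """D(q): cross product of every agent's available actions in q."""
        return [tuple(m) for m in product(*(sorted(self.avail(a, q)) for a in self.agents))]

    def successors(self, q: State) -> set:
        """All states reachable from q by some legal joint move."""
        return {self.trans(q, m) for m in self.joint_moves(q)}


# Toy instance (an assumption for illustration): the system moves to s1
# exactly when both agents pick the same action.
def _avail(agt: str, q: State) -> FrozenSet[Action]:
    return frozenset({"a", "b"})


def _trans(q: State, move: JointMove) -> State:
    return "s1" if move[0] == move[1] else "s0"


toy = CGS(["1", "2"], frozenset({"s0", "s1"}), _avail,
          {"s0": frozenset(), "s1": frozenset({"p"})}, _trans)
```

Representing $d$ and $\sigma$ as functions rather than tables keeps the sketch short; an explicit-state model checker would more likely tabulate them.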

A path $\lambda = q_0 q_1 \cdots$ in a structure $\mathcal{M}$ is a, possibly infinite, sequence of states such
that for each $i \geq 0$, there exists a joint-move $\vec{a}_i \in \mathcal{D}(q_i)$ for which $\sigma(q_i, \vec{a}_i) = q_{i+1}$.
We use $\lambda[i] = q_i$ to denote the $i$-th state of $\lambda$, $\Lambda$ to denote the set of all paths in $\mathcal{M}$,
and $\Lambda(q)$ to denote those starting in $q$. Also, $|\lambda|$ denotes the length of $\lambda$ as the number
of state transitions in $\lambda$: $|\lambda| = \ell$ if $\lambda = q_0 q_1 \cdots q_\ell$, and $|\lambda| = \infty$ if $\lambda$ is infinite. When
$0 \leq i \leq j \leq |\lambda|$, then $\lambda[i, j] = q_i q_{i+1} \cdots q_j$ is the finite subpath between the $i$-th and
$j$-th steps in $\lambda$. Finally, a computation path in $\mathcal{M}$ is an infinite path in $\Lambda$.

To provide semantics to formulas $\langle\langle \cdot \rangle\rangle \varphi$, ATL relies on the notion of agent strategies.
Technically, an ATL strategy for an agent $agt$ is a function $f_{agt} : Q^+ \mapsto Act$, where
$f_{agt}(\lambda q) \in d(agt, q)$ for all $\lambda q \in Q^+$, stating a particular action choice of agent $agt$ at
path $\lambda q$. A collective strategy for a group of agents $A \subseteq \mathcal{A}$ is a set of strategies
$F_A = \{f_{agt} \mid agt \in A\}$ providing one specific strategy for each agent $agt \in A$. For a collective
strategy $F_A$ and an initial state $q$, it is not difficult to define the set $out(q, F_A)$ of all
possible outcomes of $F_A$ starting at state $q$ as the set of all computation paths that may
ensue when the agents in $A$ behave as prescribed by $F_A$, and the remaining agents
follow any arbitrary strategy [3, 20]. The semantics for the coalition modality is then
defined as follows (here $\phi$ is a path formula, that is, it is preceded by $\bigcirc$, $\Box$, or $\mathcal{U}$, and
$\mathcal{M}, \lambda \models \phi$ is defined in the usual way [3]):



$\mathcal{M}, q \models \langle\langle A \rangle\rangle \phi$ iff there is a collective strategy $F_A$ such that for all computations
$\lambda \in out(q, F_A)$, we have $\mathcal{M}, \lambda \models \phi$.



The coalition modality only allows for implicit (existential) quantification over
strategies. In some contexts, though, it is important to refer to strategies explicitly in
the language, e.g., can a player win the game if the opponent plays a specified strategy?
To address this limitation, Walther et al. [20] proposed ATLES, an extension of ATL
where the coalition modality is extended to $\langle\langle A \rangle\rangle_\rho$, where $\rho$ is a commitment function,
that is, a partial function mapping agents to so-called strategy terms. Formula $\langle\langle A \rangle\rangle_\rho \varphi$
thus means that "while the agents in the domain of $\rho$ act according to their
commitments, the coalition $A$ can cooperate to ensure $\varphi$ as an outcome."

The motivation for our work stems from the fact that ATLES is agnostic about the
source of the strategy terms: all meaningful strategies have already been identified. In
the context of multi-agent systems, it may not be an easy task to identify those strate- 
gies compatible with the agents' behaviors, as those systems are generally built using 
programming frameworks [5] that are very different from ATL(ES). 






2.2 BDI Programming 

The BDI agent-oriented programming paradigm is a popular and successful approach 
for building agent systems, with roots in philosophical work on rational action [6] and 
a plethora of programming languages and systems available, such as Jack, Jason, 



A typical BDI agent continually tries to achieve its goals (or desires) by selecting an
adequate plan from its plan library given its current beliefs, and placing it into the
intention base for execution. The agent's plan library $\Pi$ encodes the standard operational
knowledge of the domain by means of a set of plan-rules (or "recipes") of the form
$\phi[\alpha]\psi$: plan $\alpha$ is a reasonable plan to adopt for achieving $\psi$ when (context) condition $\phi$
is believed true. For example, walking towards location $x$ from $y$ is a reasonable
strategy, if there is a short distance between $x$ and $y$ (and the agent wants to be eventually at
location $x$). Conditions $\phi$ and $\psi$ are (propositional) formulas talking about the current
and goal states, respectively. Though different BDI languages offer different constructs
for crafting plans, most allow for sequences of domain actions that are meant to be
directly executed in the world (e.g., lifting an aircraft's flaps), and the posting of
(intermediate) sub-goals $!\psi$ (e.g., obtain landing permission) to be resolved. The intention
base, in turn, contains the current, partially executed, plans that the agent has already
committed to for achieving certain goals. Current intentions being executed provide a
screen of admissibility for attention focus [6].
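
A plan-rule and the selection of applicable plans can be sketched as follows. This is a simplified illustration under stated assumptions, not the semantics of any particular BDI language: context conditions are assumed to be conjunctions of atoms (a set), the plan body is assumed to be a single action, goals are assumed to be single atoms, and the names `PlanRule` and `applicable` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class PlanRule:
    """A plan-rule phi[alpha]psi under the simplifying assumptions above."""
    context: FrozenSet[str]  # phi: atoms that must be believed true
    action: str              # alpha: one primitive action (assumption)
    goal: str                # psi: the atom the plan is meant to achieve


def applicable(library: Iterable[PlanRule], beliefs: Set[str], goals: Set[str]) -> List[PlanRule]:
    """Plans whose context holds in the current beliefs and whose goal is wanted."""
    return [p for p in library if p.context <= beliefs and p.goal in goals]


# The walking example from the text: walk to x when x is near and being at x is a goal.
walk = PlanRule(frozenset({"near_x"}), "walk_to_x", "at_x")
```

For instance, `applicable([walk], {"near_x"}, {"at_x"})` selects the walking plan, whereas an agent that does not believe `near_x` gets no applicable plan.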

Though we do not present it here for lack of space, most BDI-style programming 
languages come with a clear single-step semantics basically realizing the execution
model of [18], in which (rational) behavior arises due to the execution of plans from the agent's
plan library so as to achieve certain goals relative to the agent's beliefs. 



3 BDI-ATLES: ATL for BDI Agents

Here we develop an ATL(ES)-like logic that bridges the gap between verification
frameworks and BDI agent-oriented programming languages. The overarching idea is for BDI
programmers to be able to encode BDI applications in ATL in a principled manner.

Recall that ATL(ES) uses strategies to denote the agent's choices among possible
actions. For a BDI agent these strategies are implicit in her know-how. In particular, we
envision BDI agents defined with a set of goals and so-called capabilities [7, 16].
Generally speaking, a capability is a set/module of procedural knowledge (i.e., plans) for
some functional requirement. An agent may have, for instance, the Navigate capability
encoding all plans for navigating an environment. Equipped with a set of capabilities, a
BDI agent executes actions as per plans available so as to achieve her goals, e.g.,
exploring the environment. In this context, the BDI developer is then interested in what agents
can achieve at the level of goals and capabilities. Inspired by ATLES, we develop a
logic that caters for this requirement without departing much from the ATL framework.

In this work, we shall consider plans consisting of single actions, that is, given a
BDI plan of the form $\phi[\alpha]\psi$, the body $\alpha$ of the plan consists of one primitive action.
Such plans are akin to those in the GOAL agent programming language [11], as well as
universal-plans [19], and reactive control modules [4]. Let $\Pi^{\mathcal{P}}_{Act}$ be the (infinite) set of
all possible plan-rules given a set of actions $Act$ and a set of domain propositions $\mathcal{P}$.

3.1 BDI-ATLES Syntax

The language of BDI-ATLES is defined over a finite set of atomic propositions $\mathcal{P}$,
a finite set of agents $\mathcal{A}$, and a finite set of capability terms $\mathcal{C}$ available in the BDI
application of concern. Intuitively, each capability term $c \in \mathcal{C}$ (e.g., Navigate) stands
for a plan library $\Pi^c$ (e.g., $\Pi^{Navigate}$). As usual, a coalition is a set $A \subseteq \mathcal{A}$ of agents.
A capability assignment $\omega$ consists of a set of pairs of agents with their capabilities of
the form $\langle agt : C_{agt} \rangle$, where $agt \in \mathcal{A}$ and $C_{agt} \subseteq \mathcal{C}$. A goal assignment $\varrho$, in turn,
defines the goal base (i.e., set of propositional formulas) for some agents, and is a set
of tuples of the form $\langle agt : G_{agt} \rangle$, where $agt \in \mathcal{A}$ and $G_{agt}$ is a set of boolean formulas
over $\mathcal{P}$. We use $\mathcal{A}_\omega$ to denote the set of agents for which their capabilities are defined
by assignment $\omega$, that is, $\mathcal{A}_\omega = \{agt \mid \langle agt : C_{agt} \rangle \in \omega\}$. Set $\mathcal{A}_\varrho$ is defined analogously.

The set of BDI-ATLES formulas is then exactly like that of ATL(ES), except that
coalition formulas are now of the form $\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$, where $\varphi$ is a path formula (i.e., it is
preceded by $\bigcirc$, $\Box$, or $\mathcal{U}$), $A$ is a coalition, and $\omega$ and $\varrho$ range over capability and goal
assignments, respectively, such that $\mathcal{A}_\omega = \mathcal{A}_\varrho$. Its intended meaning is as follows:

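$\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$ expresses that the coalition of agents $A$ can jointly force temporal
condition $\varphi$ to hold when BDI agents in $\mathcal{A}_\omega$ (or $\mathcal{A}_\varrho$, since $\mathcal{A}_\varrho = \mathcal{A}_\omega$) are equipped
with capabilities as per assignment $\omega$ and (initial) goals as per assignment $\varrho$.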

Notice that we require, in each coalition (sub)formula, that the agents to which
capabilities and goals are assigned be the same. This enforces the constraint that
BDI-style agents have both plans and goals. Hence, a formula of the form
$\langle\langle A \rangle\rangle_{\emptyset,\{\langle a_1 : \{\gamma\} \rangle\}} \varphi$
would not be valid, as agent $a_1$ has one goal (namely, to bring about $\gamma$), but its set
of plans is not defined, so we cannot specify what its rational behavior may be. This
contrasts with formula $\langle\langle A \rangle\rangle_{\{\langle a_1 : \emptyset \rangle\},\{\langle a_1 : \{\gamma\} \rangle\}} \varphi$, a valid formula in which agent $a_1$ is
assumed to have no plans (i.e., the agent has empty know-how) and one goal.
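
The well-formedness requirement $\mathcal{A}_\omega = \mathcal{A}_\varrho$ amounts to a simple domain check. The sketch below assumes, purely for illustration, that both assignments are represented as Python dictionaries from agent names to sets; the function name `well_formed` is hypothetical.

```python
from typing import Dict, Set


def well_formed(omega: Dict[str, Set[str]], rho: Dict[str, Set[str]]) -> bool:
    """True iff capabilities and goals are assigned to exactly the same agents."""
    return set(omega) == set(rho)


# The two formulas discussed above, with "gamma" standing for the goal formula:
no_plans_defined = well_formed({}, {"a1": {"gamma"}})             # invalid formula
empty_know_how = well_formed({"a1": set()}, {"a1": {"gamma"}})    # valid formula
```

Note the difference between an agent missing from the capability assignment and an agent mapped to the empty set of capabilities: only the latter counts as a BDI agent, albeit one with empty know-how.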

Example 1. Consider the following simplified instance of the gold mining domain with
three locations A, B and C, a gold piece at location C, the depot located at B
(rectangle location), and two players Ag (BDI agent) and En (enemy):

[Figure: locations A, B, and C; En starts at A, Ag at B (the depot), and the gold piece is at C.]

Players can move LEFT/RIGHT, PICK/DROP gold, or remain still by executing the
special action NOOP. Proposition $X_Y$, where $X \in \{Ag, En\}$ and $Y \in \{A, B, C\}$, encodes
that player $X$ is at location $Y$; whereas propositions $G_A$, $G_B$, $G_C$, $G_{Ag}$, and $G_{En}$
denote that the gold is at location A/B/C or being held by agent Ag/En, respectively. The
depot is assumed to be always at B and hence is not represented explicitly.

The winning condition for player Ag is $\psi_{WIN} = G_B \wedge Ag_B$: the player wins when
collocated with gold at the depot.

Among the many capabilities available encoding the know-how information of the
domain, we consider the following three. The Collect capability includes plans to pick
gold, such as $Ag_C \wedge G_C [\text{PICK}] G_B$: if gold needs to be at B and the agent is at C, where
there is indeed gold, then execute the PICK action. Similarly, capability Deposit
contains plans like $G_{Ag} \wedge Ag_B [\text{DROP}] G_B$, for example, to allow dropping of gold at the
desired location. Lastly, capability Navigate has plans for moving around, such as
$Ag_C [\text{LEFT}] Ag_B$ to move left from location C to (desired destination) B. □

Fig. 1. A fragment of a Gold domain model and a picture showing rational traces and strategies:
(a) a section of the BDI-ATLES alternating model; (b) traces $\lambda^+_1$ and $\lambda^+_2$ resulting from
strategies $f^1_{Ag}$ and $f^2_{Ag}$, respectively.
Actions LEFT, RIGHT, PICK, DROP, and NOOP are abbreviated with their first letter.

The remainder of this section provides the right interpretation for such
formulas, under the assumption that agents act rationally as per the BDI paradigm.

3.2 BDI-ATLES Semantics 

A BDI-ATLES concurrent game structure is a tuple $\mathcal{M} = \langle \mathcal{A}, Q, \mathcal{P}, Act, d, \mathcal{V}, \sigma, \Theta \rangle$, with:

- $\mathcal{A}$, $Q$, $\mathcal{P}$, $Act$, $d$, $\mathcal{V}$ and $\sigma$ are as in ATL(ES).

- There is a distinguished dummy action $\text{NOOP} \in Act$ such that $\text{NOOP} \in d(agt, q)$ and
$\sigma(q, \langle \text{NOOP}, \ldots, \text{NOOP} \rangle) = q$, for all $agt \in \mathcal{A}$ and $q \in Q$, that is, NOOP is always
available to all agents and the system remains still when all agents perform it.

- Capability function $\Theta : \mathcal{C} \mapsto \mathcal{F}(\Pi^{\mathcal{P}}_{Act})$ maps capability terms to their (finite) set of
plans. (Here, $\mathcal{F}(X)$ denotes the set of all finite subsets of $X$.)

Example 2. Figure 1(a) shows a partial model for the gold game. The game starts at
state $q_0$, with players Ag and En located at B and A, resp., and gold present at C.
From there, player Ag has a winning strategy: reach the gold earlier and deposit it in
the depot. This can be seen in path $q_0 q_1 q_2 q_3 q_4$. However, this is possible only when the
agent Ag is indeed equipped with all three capabilities. If, on the other hand, the agent
lacks capability Collect, for instance, then player En may actually manage to win the
game, as evident from the path $q_0 q_1 q_5 q_6 q_7 q_8$. □

BDI-ATLES models are similar to ATLES ones, except that capability, rather than 
strategy term, interpretations are used. In a nutshell, the challenge thus is to characterize 
what the underlying "low-level" ATL strategies for agents with certain capabilities and 
goals are. We call such strategies rational strategies, in that they are compatible with 
the standard BDI rational execution model [18]: they represent the agent acting as per
her available plans in order to achieve her goals in the context of her beliefs. 

So, given an agent $agt \in \mathcal{A}$, a plan-library $\Pi$, and a goal base $\mathcal{G}$, we define $\Sigma^{agt}_{\Pi,\mathcal{G}}$
to be the set of standard ATL strategies for agent $agt$ in $\mathcal{M}$ that are rational strategies
when the agent is equipped with plan-library $\Pi$ and has $\mathcal{G}$ as (initial) goals, that is,
those ATL strategies in which the agent always chooses an action that is directed by
one of its available plans in order to achieve one of its goals in the context of its current
beliefs. The core idea behind defining set $\Sigma^{agt}_{\Pi,\mathcal{G}}$ is to identify those "rational traces" in
the structure that are compatible with the BDI deliberation process in which the agent
acts as per her goals and beliefs. Traces just generalize paths to account for the actions
performed at each step, and are hence of the form $\lambda^+ = q_0 a_1 q_1 \cdots a_\ell q_\ell$ such that
$q_0 q_1 \cdots q_\ell$ is a (finite) path. Rational strategies, then, are those that only yield rational
traces. Technically, we define rational traces in three steps. First, we define a
goal-marking function $g(\lambda^+, i)$ denoting the "active" goal base of the agent at the $i$-th stage
of trace $\lambda^+$. Basically, a goal-marking function keeps track of the goals that the agent
has already achieved at each stage in a trace. Second, we define $Exec(\phi[\alpha]\psi, g, \lambda^+)$
as the set of indexes (i.e., stages) in trace $\lambda^+$ where the plan $\phi[\alpha]\psi$ may have been
executed by the agent: the plan's precondition $\phi$ was true, $\psi$ was an active goal of the
agent (as directed by goal-marking function $g$), and $\alpha$ was indeed performed. Third,
we say a trace $\lambda^+$ is "rational" if at every moment in the run the agent executed
one of its plans. That is, for every index $i$, it is the case that $i \in Exec(\phi[\alpha]\psi, g, \lambda^+)$,
for some plan $\phi[\alpha]\psi$ in her know-how library. With these notions, $\Sigma^{agt}_{\Pi,\mathcal{G}}$ denotes the set
of all ATL strategies whose executions always yield rational traces. The laborious, and
arguably boring, technical details of all this can be found in the Appendix.
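
The three steps above can be approximated in a few lines. The following is only a sketch under strong assumptions, since the precise definitions are in the Appendix and not reproduced here: goals and context conditions are treated as atoms and sets of atoms, a goal is assumed to be dropped once it has been true in some earlier state of the trace, and remaining still via NOOP is assumed to be the only rational choice when no plan is applicable (mirroring the $d^{BDI}$ function used for model checking). The names `active_goals` and `is_rational` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Plan = Tuple[FrozenSet[str], str, str]  # (context phi, action alpha, goal psi)


def active_goals(initial_goals: Set[str], states: List[str],
                 valuation: Dict[str, Set[str]], i: int) -> Set[str]:
    """Goal marking g(trace, i): initial goals not yet true in any of q_0..q_i."""
    achieved = set().union(*(valuation[q] for q in states[: i + 1]))
    return set(initial_goals) - achieved


def is_rational(states: List[str], actions: List[str],
                valuation: Dict[str, Set[str]], library: List[Plan],
                initial_goals: Set[str], noop: str = "NOOP") -> bool:
    """Check that every action in the trace q_0 a_1 q_1 ... is backed by an applicable plan."""
    for i, act in enumerate(actions):  # actions[i] is performed in states[i]
        goals = active_goals(initial_goals, states, valuation, i)
        options = {alpha for (phi, alpha, psi) in library
                   if phi <= valuation[states[i]] and psi in goals}
        if not options:
            options = {noop}  # assumption: idle only when nothing is applicable
        if act not in options:
            return False
    return True
```

With a library containing the Collect plan from Example 1 and a hypothetical plan that moves right towards the gold, a trace that moves and then picks is accepted, whereas a trace in which the agent idles while a plan is applicable is rejected.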

Example 3. Figure 1(b) depicts two possible traces $\lambda^+_1$ and $\lambda^+_2$ (for agent Ag)
compatible with strategies $f^1_{Ag}$ and $f^2_{Ag}$, resp. Trace $\lambda^+_1$ is due to the agent executing
actions as per its applicable plans, as evident from the plan labeling. For example, at
state $q_1$, the agent is in a gold location and hence executes the pick action as per plan
$Ag_C \wedge G_C [\text{PICK}] G_B$. Consequently, the strategy $f^1_{Ag}$ is rational, as it yields rational trace
$\lambda^+_1$. Trace $\lambda^+_2$, on the other hand, does not obey the BDI rationality constraints (e.g., the
agent remains still in location B, despite an applicable plan being available). □

Assuming that set $\Sigma^{agt}_{\Pi,\mathcal{G}}$ of rational strategies has been suitably defined, we are
ready to detail the semantics for formulas of the form $\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$. Following ATLES, we
first extend the notion of a joint strategy for a coalition to that of joint strategy under
a given capability and goal assignment. So, given a capability (goal) assignment $\omega$ ($\varrho$)
and an agent $agt \in \mathcal{A}_\omega$ ($agt \in \mathcal{A}_\varrho$), we denote $agt$'s capabilities (goals) under $\omega$ ($\varrho$)
by $\omega[agt]$ ($\varrho[agt]$). Intuitively, an $(\omega, \varrho)$-strategy for coalition $A$ is a joint strategy for
$A$ such that (i) agents in $A \cap \mathcal{A}_\omega$ only follow "rational" (plan-goal compatible)
strategies as per their $\omega$-capabilities and $\varrho$-goals; and (ii) agents in $A \setminus \mathcal{A}_\omega$ follow arbitrary
strategies. Formally, an $(\omega, \varrho)$-strategy for coalition $A$ (with $\mathcal{A}_\omega = \mathcal{A}_\varrho$) is a collective
strategy $F_A$ for agents $A$ such that for all $f_{agt} \in F_A$ with $agt \in A \cap \mathcal{A}_\omega$, it is the case
that $f_{agt} \in \Sigma^{agt}_{\Pi,\mathcal{G}}$, where $\Pi = \bigcup_{c \in \omega[agt]} \Theta(c)$ and $\mathcal{G} = \varrho[agt]$. Note that no requirements are
imposed on the strategies for the remaining agents $A \setminus \mathcal{A}_\omega$, besides of course being legal
(ATL) strategies. Also, whereas ATLES $\rho$-strategies are collective strategies including
all agents in the domain of commitment function $\rho$, our $(\omega, \varrho)$-strategies are collective
strategies for the coalition of concern only. This is because commitment functions
induce deterministic agent behaviors, whereas capabilities and goals assignments induce
non-deterministic ones. We will elaborate on this issue below.

Using the notions of $(\omega, \varrho)$-strategies and that of possible outcomes for a given
collective strategy from ATL (refer to function $out(\cdot, \cdot)$ from Preliminaries), we are
now able to state the meaning of BDI-ATLES (coalition) formulas:

$\mathcal{M}, q \models \langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$ iff there is an $(\omega, \varrho)$-strategy $F_A$ such that for all $(\omega, \varrho)$-strategies
$F_{\mathcal{A}_\omega \setminus A}$ for $\mathcal{A}_\omega \setminus A$, it is the case that $\mathcal{M}, \lambda \models \varphi$, for all paths $\lambda \in out(q, F_A \cup F_{\mathcal{A}_\omega \setminus A})$.

2 As with ATL(ES), $\varphi$ ought to be a path formula and is interpreted in the usual manner. We
omit the other ATL-like cases for brevity; see [20].

Intuitively, $F_A$ stands for the collective strategy of agents $A$ guaranteeing the
satisfaction of formula $\varphi$. Because $F_A$ is an $(\omega, \varrho)$-strategy, some agents in $A$ (those whose
capabilities and goals are defined by $\omega$ and $\varrho$, resp.) are to follow strategies that
correspond to rational executions of their capabilities. At the same time, because other agents
outside the coalition could have also been assigned capabilities and goals, the chosen
collective strategy $F_A$ needs to work no matter how such agents (namely, agents $\mathcal{A}_\omega \setminus A$)
behave, as long as they do it rationally given their plans and goals. That is, $F_A$ has to
work with any rational collective strategy $F_{\mathcal{A}_\omega \setminus A}$. Finally, the behavior of all remaining
agents (namely, those in $\mathcal{A} \setminus (A \cup \mathcal{A}_\omega)$) is taken into account when considering all
possible outcomes, after all strategies for agents in $A \cup \mathcal{A}_\omega$ have been settled.

While similar to ATLES coalition formulas $\langle\langle A \rangle\rangle_\rho \varphi$, BDI-ATLES coalition
formulas $\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$ differ in one important aspect that makes their semantics more involved.
Specifically, whereas commitment functions $\rho$ prescribe deterministic behaviors for
agents, capabilities and goals assignments yield multiple potential behaviors for the
agents of interest. This nondeterministic behavior stems from the fact that BDI agents
can choose what goals to work on at each point and what available plans to use for
achieving such goals. Technically, this is reflected in the fact that the strategies for each agent
in $\mathcal{A}_\omega \setminus A$ (those agents with assigned capabilities and goals but not part of the
coalition) cannot be (existentially) considered together with those of agents in $A$ or
(universally) accounted for via the possible outcomes function $out(\cdot, \cdot)$, as such
function puts no rationality constraints on the remaining (non-committed) agents. Thus,
whereas agents in $A \cap \mathcal{A}_\omega$ are allowed to select one possible rational behavior, all
rational behaviors for agents in $\mathcal{A}_\omega \setminus A$ need to be taken into consideration.

We close this section by noting an important, and expected, monotonicity property 
of BDI-ATLES w.r.t. changes in the goals and plans of agents. 

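Proposition 1. $\models \langle\langle A \rangle\rangle_{\omega,\varrho} \varphi \supset \langle\langle A' \rangle\rangle_{\omega',\varrho'} \varphi$ holds, provided that:

- $A \subseteq A'$, that is, the coalition is not reduced;

- $\omega[agt] \subseteq \omega'[agt]$ and $\varrho[agt] \subseteq \varrho'[agt]$, for all $agt \in \mathcal{A}_\omega \cap A$, that is, the goals and
capabilities of those BDI agents in the coalition are not reduced;

- $\mathcal{A}_\omega \setminus A \subseteq \mathcal{A}_{\omega'} \setminus A'$, that is, the set of non-BDI agents outside the coalition is not
reduced (but there could be new BDI agents outside the coalition); and

- $\omega'[agt] \subseteq \omega[agt]$ and $\varrho'[agt] \subseteq \varrho[agt]$, for all $agt \in \mathcal{A}_\omega \setminus A$, that is, the goals and
capabilities of those BDI agents outside the coalition are not augmented.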

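Informally, augmenting the goals/plans of agents in a coalition does not reduce the
ability of agents. This is because a collective $(\omega, \varrho)$-strategy for coalition $A$ to bring
about a formula would still work if more goals and plans are given to the agents in the
coalition (second condition). Observe, on the other hand, that augmenting the goals or
plans of those agents outside the coalition may yield new behavior that can indeed
interfere with the coalition's original abilities (last condition). This even includes turning
BDI agents into non-BDI agents (third condition). Of course, as in ATL, enlarging the
coalition does not reduce ability (first condition).

foreach $\varphi'$ in $Sub(\varphi)$ w.r.t. $\mathcal{M} = \langle \mathcal{A}, Q, \mathcal{P}, Act, d, \mathcal{V}, \sigma, \Theta \rangle$ do
  case $\varphi' = p$ : $[\varphi']_{\mathcal{M}} = \mathcal{V}(p)$;
  case $\varphi' = \neg\theta$ : $[\varphi']_{\mathcal{M}} = ([True]_{\mathcal{M}} \setminus [\theta]_{\mathcal{M}})$;
  case $\varphi' = \theta_1 \vee \theta_2$ : $[\varphi']_{\mathcal{M}} = [\theta_1]_{\mathcal{M}} \cup [\theta_2]_{\mathcal{M}}$;
  case $\varphi' = \langle\langle A \rangle\rangle_{\omega,\varrho} \bigcirc\theta$ : $[\varphi']_{\mathcal{M}} = ws(Pre(A, \omega, \Theta, [\theta]_{\mathcal{M}_\varrho}) \cap [\![\varrho]\!])$;
  case $\varphi' = \langle\langle A \rangle\rangle_{\omega,\varrho} \Box\theta$ : $\rho = [True]_{\mathcal{M}_\varrho}$; $\tau = [\theta]_{\mathcal{M}_\varrho}$;
    while $\rho \not\subseteq \tau$ do $\rho = \tau$; $\tau = Pre(A, \omega, \Theta, \rho) \cap [\theta]_{\mathcal{M}_\varrho}$ od;
    $[\varphi']_{\mathcal{M}} = ws(\rho \cap [\![\varrho]\!])$;
  case $\varphi' = \langle\langle A \rangle\rangle_{\omega,\varrho} \theta_1 \mathcal{U} \theta_2$ : $\rho = [False]_{\mathcal{M}_\varrho}$; $\tau = [\theta_2]_{\mathcal{M}_\varrho}$;
    while $\tau \not\subseteq \rho$ do $\rho = \rho \cup \tau$; $\tau = Pre(A, \omega, \Theta, \rho) \cap [\theta_1]_{\mathcal{M}_\varrho}$ od;
    $[\varphi']_{\mathcal{M}} = ws(\rho \cap [\![\varrho]\!])$;
od;
return $[\varphi]_{\mathcal{M}}$;

Fig. 2. BDI-ATLES symbolic model checking.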
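
The two fixpoint loops of Fig. 2 can be sketched over explicit state sets as follows. This is an illustrative sketch only: it assumes the pre-image operator is supplied as a function `pre` on sets of states, and it omits the final intersection with the initial-goal states and the projection $ws(\cdot)$. The function names and the three-state toy chain are hypothetical.

```python
from typing import Callable, Hashable, Set

StateSet = Set[Hashable]


def check_always(pre: Callable[[StateSet], StateSet],
                 all_states: StateSet, theta: StateSet) -> StateSet:
    """Greatest fixpoint for the 'always theta' case."""
    rho, tau = set(all_states), set(theta)
    while not rho <= tau:          # while rho is not a subset of tau
        rho = tau
        tau = pre(rho) & set(theta)
    return rho


def check_until(pre: Callable[[StateSet], StateSet],
                theta1: StateSet, theta2: StateSet) -> StateSet:
    """Least fixpoint for the 'theta1 until theta2' case."""
    rho, tau = set(), set(theta2)
    while not tau <= rho:          # while tau is not a subset of rho
        rho = rho | tau
        tau = pre(rho) & set(theta1)
    return rho


# Toy deterministic chain 0 -> 1 -> 2 -> 2 with a single (trivial) coalition.
_succ = {0: 1, 1: 2, 2: 2}


def toy_pre(target: StateSet) -> StateSet:
    return {q for q, nxt in _succ.items() if nxt in target}
```

On the toy chain, state 2 is eventually reached from everywhere, while only states 1 and 2 can stay inside {1, 2} forever.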



4 BDI-ATLES Model Checking 

Given a BDI-ATLES model $\mathcal{M}$ and a formula $\varphi$, the model checking algorithm for
BDI-ATLES computes the set of states in $\mathcal{M}$ that satisfy $\varphi$. To that end, the algorithm
has to take into account the rational choices of each BDI agent, that is, those choices
that are the consequence of the agent's goals and capabilities specified by functions $\varrho$
and $\omega$ in formulae of the form $\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$. Roughly speaking, the algorithm restricts, at
each step, the options of BDI agents to their applicable plans. We start by extending the
model $\mathcal{M}$ to embed the possible goals (based on the goal assignment) of BDI agents
into each state, and then discuss the model checking algorithm and its complexity.

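So, given a BDI-ATLES model $\mathcal{M} = \langle \mathcal{A}, Q, \mathcal{P}, Act, d, \mathcal{V}, \sigma, \Theta \rangle$ and a goal
assignment $\varrho$, the goal-extended model is a model $\mathcal{M}_\varrho = \langle \mathcal{A}, Q_\varrho, \mathcal{P}, Act, d_\varrho, \mathcal{V}_\varrho, \sigma_\varrho, \Theta \rangle$, where:

- $Q_\varrho \subseteq Q \times \prod_{agt \in \mathcal{A}_\varrho} 2^{\varrho[agt]}$ is the set of extended states, now accounting for the
possible goals of BDI agents. When $q_\varrho = \langle q, g_1, \ldots, g_{|\mathcal{A}_\varrho|} \rangle \in Q_\varrho$, where $q \in Q$
and $g_i \subseteq \varrho[agt_i]$, is an extended state, we use $ws(q_\varrho) = q$ and $gl(agt_i, q_\varrho) = g_i$
to project $\mathcal{M}$'s world state and $agt_i$'s goals. To enforce belief-goal consistency,
we require that no agent ever wants something already true: there are no $q_\varrho \in Q_\varrho$,
$agt \in \mathcal{A}_\varrho$, and formula $\gamma$ such that $\mathcal{V}(ws(q_\varrho)) \models \gamma$ and $\gamma \in gl(agt, q_\varrho)$.

- $\mathcal{V}_\varrho(q_\varrho) = \mathcal{V}(ws(q_\varrho))$, for all $q_\varrho \in Q_\varrho$, that is, state evaluation remains unchanged.

- $d_\varrho(agt, q_\varrho) = d(agt, ws(q_\varrho))$, that is, physical executability remains unchanged.

- $\sigma_\varrho(q_\varrho, \vec{a}) = \langle q', g'_1, \ldots, g'_{|\mathcal{A}_\varrho|} \rangle$, where $q' = \sigma(ws(q_\varrho), \vec{a})$ and
$g'_i = gl(agt_i, q_\varrho) \setminus \{\gamma \mid \gamma \in gl(agt_i, q_\varrho), \mathcal{V}(q') \models \gamma\}$, is the transition function for the extended model.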
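
The extended transition function can be sketched as a thin wrapper around the original one. The sketch assumes, for illustration only, that goals are atoms (so that "$\mathcal{V}(q') \models \gamma$" reduces to set membership), that an extended state is a pair of a world state and a tuple of goal sets (one per BDI agent), and that the name `extended_step` is hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple

ExtState = Tuple[str, Tuple[FrozenSet[str], ...]]


def extended_step(sigma: Callable[[str, tuple], str],
                  valuation: Dict[str, Set[str]],
                  ext_state: ExtState,
                  joint_move: tuple) -> ExtState:
    """Move the world state with sigma and drop every goal that became true."""
    q, goals = ext_state
    q_next = sigma(q, joint_move)
    remaining = tuple(
        frozenset(g for g in agent_goals if g not in valuation[q_next])
        for agent_goals in goals
    )
    return (q_next, remaining)
```

Goals that are not yet achieved persist unchanged, which is exactly the persistence/dropping behaviour described for the goal-extended model.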

Model $\mathcal{M}_\varrho$ is like $\mathcal{M}$, though suitably extended to account for agents' goals under
the initial goal-assignment $\varrho$. Observe that the transition relation caters for persistence
of goals as well as dropping of achieved goals. Indeed, the extended system will never
evolve to an (extended) state in which some agent has some true fact as a goal. Hence,
the transition relation is well-defined within $Q_\varrho$ states. More interestingly, the extended
model keeps the original physical executability of actions and, as a result, it
accommodates both rational and irrational paths. However, it is now possible to discriminate
between them, as one can reason about applicable plans in each state. Finally, it is not
difficult to see that the extended model is, in general, exponentially larger than the
original one with respect to the number of goals $\max_{agt \in \mathcal{A}_\varrho}(|\varrho[agt]|)$ and agents $|\mathcal{A}_\varrho|$.

As standard, we denote the states satisfying a formula $\varphi$ by $[\varphi]$. When the model is
not clear from the context, we use $[\varphi]_{\mathcal{M}}$ to denote the states in $\mathcal{M}$ that satisfy the
formula $\varphi$. We extend the $ws(\cdot)$ projection function to sets of extended states in the
straightforward sense, that is, $ws(S) = \bigcup_{q \in S} \{ws(q)\}$. Thus, $ws([\varphi]_{\mathcal{M}_\varrho})$ denotes the set of
all world states in $\mathcal{M}$ that are part of an extended state in $\mathcal{M}_\varrho$ satisfying the formula
$\varphi$. Also, $[\![\varrho]\!]$ denotes the set of extended states where the agents' goals are as per goal
assignment $\varrho$; formally, $[\![\varrho]\!] = \{q \mid q \in Q_\varrho, \forall agt \in \mathcal{A}_\varrho : gl(agt, q) = \varrho[agt]\}$.

Figure 2 shows the model checking algorithm for BDI-ATLES. It is based on the
symbolic model checking algorithm for ATL [3] and ATLES [20]. The first three cases
are handled in the same way as in ATL(ES). To check the BDI-ATLES coalition
formulae $\langle\langle A \rangle\rangle_{\omega,\varrho} \varphi$, we extend the model as above (relative to the formula's goal assignment
$\varrho$), and then check the plain ATL coalition formula $\langle\langle A \rangle\rangle \varphi$ in such extended model.
Note that only the set of states having the goals as per the initial goal assignment are
returned: all agents' initial goals are active in the first state of any rational trace.

Unlike standard ATL model checking, we restrict the agents' action choices as per
their capabilities. This is achieved by modifying the usual pre-image function $Pre(\cdot)$ to
only take into account actions resulting from agents' applicable plans. More concretely,
$Pre(A, \omega, \varrho, \rho)$ is the set of (extended) states from where agents in coalition $A$ can
jointly force the next (extended) state to be in set $\rho$ no matter how all other agents (i.e.,
agents in $\mathcal{A} \setminus A$) may act and provided all BDI-style agents (i.e., agents with capabilities
defined under $\omega$ and $\varrho$) behave as such. Formally:

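$$Pre(A, \omega, \varrho, \rho) = \{\, q \mid \forall i \in A,\ \exists a_i \in d^+(i, q, \omega, \varrho),\ \forall j \in \mathcal{A} \setminus A,\ \forall a_j \in d^+(j, q, \omega, \varrho) : \sigma_\varrho(q, \langle a_1, \ldots, a_{|\mathcal{A}|} \rangle) \in \rho \,\},$$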

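where auxiliary function $d^+(agt, q, \omega, \varrho)$ denotes the set of all actions that an agent $agt$
may take in state $q$ under capabilities as defined in $\omega$ and $\varrho$:

$$d^+(agt, q, \omega, \varrho) = \begin{cases} d_\varrho(agt, q) & \text{if } agt \notin \mathcal{A}_\omega \\ d_\varrho(agt, q) \cap d^{BDI}\big(agt, q, \bigcup_{c \in \omega[agt]} \Theta(c)\big) & \text{if } agt \in \mathcal{A}_\omega. \end{cases}$$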



An action belongs to set $d^+(agt, q, \omega, \varrho)$ if it is physically possible (i.e., it belongs
to $d_\varrho(agt, q)$), and BDI-rational whenever the agent in question is a BDI agent. To capture
the latter constraint, set $d^{BDI}(agt, q, \Pi)$ is defined as the set of all rational actions
for agent $agt$ in (extended) state $q$ when the agent is equipped with the set of plans $\Pi$:



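$$d^{BDI}(agt, q, \Pi) = \begin{cases} \{\alpha \mid \phi[\alpha]\psi \in \Delta(agt, q, \Pi)\} & \text{if } \Delta(agt, q, \Pi) \neq \emptyset \\ \{noOp\} & \text{otherwise} \end{cases}$$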



where $\Delta(agt, q, \Pi) = \{\phi[\alpha]\psi \in \Pi \mid \mathcal{V}(q) \models \phi,\ \gamma \in gl(agt, q),\ \psi \models \gamma\}$ is the set of all
applicable plans in $\Pi$ at state $q$. So, summarising, function $Pre(\cdot,\cdot,\cdot,\cdot)$ is an extension
of the standard ATL $Pre(\cdot)$ function in which the agents that have goals and capabilities
defined (the BDI agents) do act according to those goals and capabilities.

It is clear that the modified version of the $Pre(\cdot)$ function does not alter the complexity
of the underlying ATL-based algorithm. In fact, the variation is similar to that used for
model checking ATLES, except that the action filtering does not come from strategy
terms, but from agent plans. This means that the algorithm runs in polynomial time
w.r.t. the size of model $\mathcal{M}_\varrho$ (which is exponential w.r.t. the original model $\mathcal{M}$).
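
To make the action filtering concrete, the following Python sketch implements a plan-restricted pre-image over a small explicit model. It is only a sketch under simplifying assumptions that are not part of the paper: formulas are conjunctions of atoms encoded as frozensets (so entailment is set inclusion), the model is a plain dictionary, and all names (`Plan`, `d_bdi`, `d_plus`, `pre`, the two-state gold example) are hypothetical.

```python
from itertools import product
from typing import NamedTuple

NOOP = "noOp"


class Plan(NamedTuple):
    context: frozenset  # phi: must hold in the current state
    action: str         # alpha: the single action in the plan body
    effect: frozenset   # psi: what the plan is meant to bring about


def entails(facts, formula):
    """Toy entailment (assumption): both sides are conjunctions of atoms."""
    return formula <= facts


def d_bdi(plans, facts, goals):
    """Actions of applicable plans, or {noOp} when no plan is applicable."""
    delta = [p for p in plans
             if entails(facts, p.context)
             and any(entails(p.effect, g) for g in goals)]
    return {p.action for p in delta} if delta else {NOOP}


def d_plus(agent, q, model, libraries):
    """Physically possible moves, filtered by plans for BDI agents only."""
    moves = set(model["moves"][(agent, q)])
    if agent not in libraries:
        return moves
    return moves & d_bdi(libraries[agent], model["facts"][q],
                         model["goals"][(agent, q)])


def pre(coalition, target, model, libraries):
    """States where the coalition can force the next state into `target`."""
    others = [a for a in model["agents"] if a not in coalition]
    result = set()
    for q in model["states"]:
        def choices(group):
            # All joint choices of `group` in state q, as dicts agent -> action.
            pools = [sorted(d_plus(a, q, model, libraries)) for a in group]
            return [dict(zip(group, combo)) for combo in product(*pools)]
        if any(all(model["step"](q, {**own, **rest}) in target
                   for rest in choices(others))
               for own in choices(coalition)):
            result.add(q)
    return result


# Hypothetical two-state example: in q0 the agent stands on gold, in q1 it holds it.
gold = frozenset({"has_gold"})
model = {
    "agents": ["ag", "env"],
    "states": ["q0", "q1"],
    "facts": {"q0": frozenset({"at_gold"}),
              "q1": frozenset({"at_gold", "has_gold"})},
    "goals": {("ag", "q0"): {gold}, ("ag", "q1"): set()},
    "moves": {("ag", "q0"): {"pick", NOOP}, ("ag", "q1"): {NOOP},
              ("env", "q0"): {NOOP}, ("env", "q1"): {NOOP}},
    "step": lambda q, joint: "q1" if q == "q0" and joint["ag"] == "pick" else q,
}
library = {"ag": [Plan(frozenset({"at_gold"}), "pick", gold)]}
```

In this toy model, `pre([], {"q1"}, model, library)` contains `q0` because the BDI agent has an applicable plan and therefore must pick, whereas `pre([], {"q1"}, model, {})` does not, since an unconstrained agent could also idle.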

Theorem 1. Model checking a BDI-ATLES formula $\langle\langle A \rangle\rangle_{\omega,\varrho}\varphi$ (against a model $\mathcal{M}$)
can be done in exponential time on the number of agents $|\mathcal{A}|$ and goals $\max_{a \in \mathcal{A}}(|\varrho[a]|)$.

Of course, should we have included agents' goals explicitly in models (rather than 
using a succinct representation), as done with intentions in ATL+intentions (ATLI) [15], 
the model checking problem would retain ATL's polynomial complexity. The same 
would apply if one just generalized ATLES to explicitly require all rational-strategies 
be part of the model. The fact is, however, that generating such rational strategies by 
hand (to include them in models) will be extremely involved, even for small problems. 
In addition, our approach decouples agents' mental attitudes from the physical ATL-like
model, and enables reasoning at the level of formulae without changing the model.

We shall note that the exponentiality may not show up in certain applications. In 
many cases, for example, one is interested in just one BDI agent acting in an environ- 
ment. In that case, only such agent will be ascribed goals and capabilities. Since it arises 
due to agents with goals, the exponential complexity would therefore only be on the 
number of goals for such agent. Similarly, in situations where all agents have a single 
goal to achieve (e.g., to pick gold), the model checking would then be exponential on 
the number of BDI agents only. In the next section we shall provide one interpretation 
of goals for which the model checking problem remains polynomial. 
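
The special cases above can be checked with a small counting sketch. It assumes, as a simplification, that each extended state pairs a world state with one subset of the initial goals per BDI agent, which gives an upper bound on the size of the extended model; the function name and the numbers used are illustrative only.

```python
def extended_state_bound(num_world_states, goals_per_agent):
    """Upper bound on the number of extended states.

    goals_per_agent maps each BDI agent to its number of initial goals;
    every agent contributes one subset of its goals (2 ** n options).
    """
    bound = num_world_states
    for n_goals in goals_per_agent.values():
        bound *= 2 ** n_goals
    return bound


# One BDI agent with 4 goals: exponential in the goals of that agent only.
single_agent = extended_state_bound(10, {"ag": 4})
# Three BDI agents with a single goal each: exponential in the number of agents only.
single_goal = extended_state_bound(10, {"a1": 1, "a2": 1, "a3": 1})
```

With 10 world states, the first case yields a bound of 160 extended states and the second a bound of 80.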

5 BDI-ATLES with Maintenance Goals 

So far, we have worked on the assumption that agents have a set of "flat" achievement 
goals, goals that the agent needs to eventually bring about. One can however consider 
alternative views of goals that could suit different domains. In particular, we have con- 
sidered achievement goals with priorities and repetitive/reactive maintenance goals. In 
the first case, the framework can be easily generalized to one in which goals can be 
prioritized without an increase in complexity [? ]. 

A more promising case arises when goals are given a maintenance interpretation,
that is, (safety) properties that ought to be preserved temporally. For example, a Mars
robot has the goal to always maintain the fuel level above a certain threshold. We focus
our attention on the so-called repetitive or reactive maintenance goals [10, 12]: goals
that ought to be restored whenever "violated." Should the fuel level drop below the
threshold, the robot will act towards re-fueling. This type of goal contrasts with proactive
maintenance goals [12], under which the agent is expected to proactively avoid
situations that will violate the goal. The fact is, however, that almost all BDI platforms
(like Jack, JASON, and Jadex) only deal with the reactive version, thus providing a
middle ground between expressivity and tractability.

Technically, to accommodate maintenance goals within BDI-ATLES, one only
needs to make a small adaptation of the semantics of the logic so that goals are not
dropped forever once satisfied, but "re-appear" when violated. We refer to this alternate
version of our logic as BDI-ATLES$^M$. Of course, the model checking algorithm
discussed above also needs to be slightly adapted to deal with the new goal semantics.
Interestingly, one only needs to adapt the definition of a goal-extended model $\mathcal{M}_\varrho$ by
re-defining components $Q_\varrho$ and $\sigma_\varrho(q_\varrho, \vec{a})$; see [? ] for details.

Theorem 2. Model checking in BDI-ATLES$^M$ can be done in polynomial time (w.r.t.
the model and the formula).

Hence, for (reactive) maintenance goals, we retain ATL(ES) polynomial complexity.$^3$
Of course, this bound is tight, as BDI-ATLES$^M$ subsumes ATL (just take $\omega = \varrho = \emptyset$
in every coalition formula) and model checking ATL is PTIME-complete [3].



6 Discussion 

We have developed an ATL-like logic that relates closely to the BDI agent-oriented pro- 
gramming paradigm widely used to implement multi-agent systems. In the new logic, 
the user can express the capability of agents equipped with know-how knowledge in 
a natural way and can reason in the language about what agents can achieve under 
such capabilities. Besides the general framework with standard achievement goals, we 
argued that one could instead appeal to goals with priorities or a special type of main- 
tenance goals. We provided algorithms for model checking in such a framework and 
proved its (upper-bound) complexity in the various cases. Overall, we believe that this 
work is a first principled step to bring together two different fields in the area of multi- 
agent systems, namely, verification of strategic behaviour and agent programming. 

The framework presented here made a number of assumptions requiring further
work. Due to valuation function $\mathcal{V}$ in a structure, all agents are assumed to have full
shared observability of the environment. This is, of course, a strong assumption in
many settings. We considered here basically reactive plans, akin to the language of
GOAL [11], certain classes of 2APL/3APL [8, 13], reactive modules [1], and universal
plans [19]. We would like to explore the impact of allowing plan bodies with
sequences of actions and, more importantly, sub-goaling, as well as the possibility of
agents imposing (new) goals on other agents, via so-called BDI messages. Also, in the
context of complex plan bodies, one could then consider both linear and interleaved
execution styles of plans within each agent (for its various goals). Most of these
issues appear to be orthogonal to each other, and hence can be investigated one by one.
With the core framework laid down, our next efforts shall focus on the above issues, as
well as on proving whether the complexity result provided in Theorem 1 is tight.

We close by noting that, besides ATLES, our work has strong similarities and motivations
to those on plausibility [? ] and intention [15] reasoning in ATL. Like ATLES,
however, those works are still not linked to any approach for the actual development of
agents, which is the main motivation behind our work. Nonetheless, we would like to
investigate how to integrate plausibility reasoning in our logic, as it seems orthogonal to
that of rational BDI-style behavior. Indeed, the plausibility approach allows us to focus
the reasoning to certain parts of an ATL structure using more declarative specifications.



3 Note that the complexity of model checking ATLES is known only for memoryless strategies [20].

A Rational strategies

Given a plan library $\Pi$ and an initial goal base $\mathcal{G}$ for an agent $agt$ in structure $\mathcal{M}$, we
are to characterize those strategies for $agt$ within $\mathcal{M}$ that represent rational behaviors:
the agent tries plans from $\Pi$ in order to bring about its goals $\mathcal{G}$ given its beliefs [16, 18].



While technically involved, the idea behind the definition of set $\Sigma^{agt}_{\Pi,\mathcal{G}}$ is simple: first identify those
paths in $\mathcal{M}$ that display rational behavior for the agent; second, consider rational
those strategies that will always yield rational paths. We do this in three steps. First, we
identify constraints on how the goals of an agent can evolve in a path. Second, we define
what it means for a plan to be tried by an agent in a path. Third, we identify those
(rational) paths that result from an agent's deliberation process.

Before we start, we extend the notion of paths to account for the actions performed.
A trace is a finite sequence of alternating states and actions $\lambda^+ = q_0 a_1 q_1 \cdots a_\ell q_\ell$ such
that $q_0 q_1 \cdots q_\ell$ is a (finite) path in $\mathcal{M}$. As with paths, we use $\lambda^+[i]$ and $\lambda^+\langle i \rangle$ to denote
the $i$-th state and the $i$-th action $a_i$, respectively, in trace $\lambda^+$. The length $|\lambda^+|$ of a
trace is the number of actions in it; hence it matches the length of the underlying path.

Example 4. Figure 1(b) depicts two possible traces $\lambda^+_1$ and $\lambda^+_2$ for agent Ag, each
compatible with a different strategy of the agent. The agent has $\varphi_{WIN}$ as its initial
goal, and is equipped with capabilities Navigate, Collect and Deposit. Trace $\lambda^+_1$ results
from the agent executing actions as per its applicable plans, as is evident from the
plan labeling. For example, at state $q_1$, the agent is collocated with the gold, and so
executes the PICK action as per an applicable plan whose body is PICK. Consequently, the
strategy inducing $\lambda^+_1$ is rational, as it yields a rational trace. On the other hand, trace
$\lambda^+_2$ does not obey the rationality constraints: the agent executes the noOp action at $q_0$,
although there is in fact an applicable plan available.

Technically, $\lambda^+$ is a rational trace for an agent $agt \in \mathcal{A}$ equipped with a plan-library
$\Pi$ and having an initial goal base $\mathcal{G}$, if for all $i < |\lambda^+|$, either:

- there is an applicable plan $\phi[\alpha]\psi \in \Pi$ (for some goal $\gamma \in g_{\mathcal{G}}(\lambda^+, i)$) such that
$i \in Exec_{agt}(\phi[\alpha]\psi, g_{\mathcal{G}}, \lambda^+)$; or

- there is no applicable plan relative to the goal-marking $g_{\mathcal{G}}$ and $\lambda^+\langle i + 1 \rangle = noOp$.

We denote by $C_{\Pi,\mathcal{G}}$ the set of all rational traces for agent $agt$ with library $\Pi$ and goal
base $\mathcal{G}$. Also, to link traces and strategies, we define $GetTrace(\lambda, f_{agt})$ to be the partial
function that returns the trace induced by a path $\lambda$ and a strategy $f_{agt}$, if any. Formally,
$GetTrace(\lambda, f_{agt}) = q_0 a_1 q_1 \ldots a_{|\lambda|} q_{|\lambda|}$ iff for all $k < |\lambda|$: (i) $q_k = \lambda[k]$; and (ii) there
exists a joint-move $\vec{a} \in \mathcal{D}(q_k)$ such that $\sigma(q_k, \vec{a}) = q_{k+1}$ and $f_{agt}(\lambda[0, k]) = \vec{a}[agt] = a_{k+1}$
(where $\vec{a}[agt]$ denotes agent $agt$'s move in joint-move $\vec{a}$).

Finally, the set of rational strategies is defined as follows:

$$\Sigma^{agt}_{\Pi,\mathcal{G}} = \{\, f_{agt} \mid \bigcup_{\lambda \in \bar{\Lambda}} GetTrace(\lambda, f_{agt}) \subseteq C_{\Pi,\mathcal{G}} \,\},$$

where $\bar{\Lambda} \subseteq \Lambda$ is the set of finite paths in $\mathcal{M}$. That is, a rational strategy $f_{agt}$ is one that
only yields rational traces.

Goal evolution in traces BDI agents achieve their goals by acting as per
their plans. In this section, we shall identify the possible achievement goals [10] an
agent may have at each moment in a path, under a blind commitment strategy [17] (the
most common strategy in BDI programming platforms), in which a rationality principle
states that an agent drops the goals it has already achieved. We discuss maintenance
goals in the next section as a special case.

To that end, we make use of a so-called goal-marking function $g_{\mathcal{G}}(\lambda^+, i)$, for an agent
with an initial goal base $\mathcal{G}$, which returns the "active" goal base of the agent at moment
$\lambda^+[i]$ in trace $\lambda^+$. As expected, the goal-marking function returns the unfulfilled
subset of the initial goals. Formally,

$$g_{\mathcal{G}}(\lambda^+, i) = \{\gamma \mid \gamma \in \mathcal{G},\ (\neg\exists j \le i)\ \mathcal{V}(\lambda^+[j]) \models \gamma\}.$$

In this setting, the agent picks an active goal and executes a plan for it. If the plan 
fails to achieve the goal, the goal remains active and the agent can either pursue the 
same goal or a different one. On the other hand, if the plan succeeds in bringing about 
the goal, then the achieved goal is removed from the agent's active goal set. 
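
The goal-marking function for achievement goals can be sketched in a few lines of Python. The encoding is an assumption made for illustration: a trace is represented only by its list of states, each state is the frozenset of atoms true in it, and each goal is a conjunction of atoms (so a goal holds in a state when it is a subset of that state). The name `goal_marking` and the example atoms are hypothetical.

```python
def goal_marking(initial_goals, states, i):
    """Goals from the initial goal base not satisfied at any moment 0..i."""
    return {g for g in initial_goals
            if not any(g <= states[j] for j in range(i + 1))}


# Hypothetical trace: the agent gets the gold at moment 1 and loses it at moment 2.
has_gold = frozenset({"has_gold"})
deposited = frozenset({"deposited"})
states = [frozenset(), frozenset({"has_gold"}), frozenset()]

active_at_0 = goal_marking({has_gold, deposited}, states, 0)
active_at_2 = goal_marking({has_gold, deposited}, states, 2)
```

Note that `has_gold` stays out of the active goal base at moment 2 even though it is no longer true there: under this reading, an achievement goal is dropped for good once it has been achieved.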

Plan executions in traces Agents developed under the BDI paradigm are meant to 
execute actions as per the plans/know-how available to them. We shall next define what 
it means for a trace to include an execution of a plan. 

When it comes to selecting plans for execution, there are generally two core notions
in BDI programming. A plan is relevant if its intended effects are enough to bring about
some goal of the agent. Technically, given a trace $\lambda^+$ and the goal-marking function $g$,
we say that a plan-rule $\phi[\alpha]\psi$ is relevant at moment $\lambda^+[i]$ in the trace, with $0 \le i <
|\lambda^+|$, if there exists $\gamma \in g(\lambda^+, i)$ such that $\psi \models \gamma$. Furthermore, the plan is applicable
at $\lambda^+[i]$ if it is relevant and its context condition holds true, that is, $\mathcal{V}(\lambda^+[i]) \models \phi$.

So, we identify the moments in which a particular plan may have been executed in
a trace by an agent. Formally, $Exec_{agt}(\phi[\alpha]\psi, g, \lambda^+)$ is the set of indices $i$ such that (i)
$\lambda^+\langle i + 1 \rangle = \alpha$; and (ii) $\phi[\alpha]\psi$ is applicable at $\lambda^+[i]$ under goal-marking $g$.

Rational traces and rational strategies We now have all the technical machinery to
define the set of rational traces (those that can be explained by the agent acting as per
its available plans in order to achieve its goals relative to its beliefs) and the set $\Sigma^{agt}_{\Pi,\mathcal{G}}$
of rational strategies used to define the semantics of BDI-ATLES.

Initially, an agent has a set of goals (initial goal base) that it wants to bring about.
The agent then chooses one goal to work on, and selects an applicable plan for such goal
from its plan library for execution. If the plan successfully brings about the goal, then
the agent deems the goal achieved and the plan finished. Traces that can be "explained"
in this way are referred to as rational traces.
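
Putting the pieces together, a trace can be tested for rationality as in the Python sketch below. It is an illustration under stated assumptions rather than the paper's algorithm: formulas are conjunctions of atoms encoded as frozensets, a trace is given as a list of states plus a list of actions (so `actions[i]` plays the role of the (i+1)-th action), the goal marking is passed in as a function of the moment, and all identifiers are hypothetical.

```python
from typing import NamedTuple

NOOP = "noOp"


class Plan(NamedTuple):
    context: frozenset  # phi
    action: str         # alpha
    effect: frozenset   # psi


def achievement_marking(goals, states):
    """Marking for achievement goals: a goal is dropped once it has held."""
    return lambda i: {g for g in goals
                      if not any(g <= s for s in states[: i + 1])}


def applicable(plan, states, i, marking):
    """Relevant for some active goal, and context true at moment i."""
    relevant = any(g <= plan.effect for g in marking(i))
    return relevant and plan.context <= states[i]


def exec_indices(plan, states, actions, marking):
    """Moments at which the plan may have been executed in the trace."""
    return {i for i in range(len(actions))
            if actions[i] == plan.action and applicable(plan, states, i, marking)}


def is_rational(states, actions, plans, marking):
    """Every step is explained by an applicable plan, or is an idle noOp."""
    for i in range(len(actions)):
        if any(i in exec_indices(p, states, actions, marking) for p in plans):
            continue
        idle_ok = (actions[i] == NOOP and
                   not any(applicable(p, states, i, marking) for p in plans))
        if not idle_ok:
            return False
    return True


# Hypothetical traces in the spirit of Example 4.
gold = frozenset({"has_gold"})
plans = [Plan(frozenset({"at_gold"}), "pick", gold)]
at_gold = frozenset({"at_gold"})
holding = frozenset({"at_gold", "has_gold"})

good_states, good_actions = [at_gold, holding, holding], ["pick", NOOP]
bad_states, bad_actions = [at_gold, at_gold], [NOOP]

good = is_rational(good_states, good_actions, plans,
                   achievement_marking({gold}, good_states))
bad = is_rational(bad_states, bad_actions, plans,
                  achievement_marking({gold}, bad_states))
```

The first trace is accepted: the pick step is explained by the plan, and the later noOp occurs only after the goal has been dropped. The second is rejected because the agent idles while a plan is applicable.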

B Priorities and Maintenance Goals 

So far, we have studied BDI-ATLES under the assumption that agents have a set of "flat"
achievement goals, goals that the agent needs to eventually bring about. In this section,
we consider our logic under two alternative views of the agent's goal base that could
suit different domains, namely, achievement goals with priorities and repetitive/reactive
maintenance goals. Interestingly, the model checking problem for the latter special case
remains polynomial.



B.1 Goals with priority

The BDI-ATLES framework defined above treats all goals with equal importance, in
that a BDI agent may choose to execute an applicable plan for any active goal. In many
situations, though, an agent may prefer achieving certain goals first. Thus, a gardening
agent might want to both pluck fruits and collect dirt, but the former is of higher priority.
Similarly, a sales agent will generally prefer attending to a customer or a phone call over
doing some back-office task.

To accommodate goal bases with priorities, we extend the notion of goal assignment
($\varrho$) to tuples of the form $\langle agt : \langle \Gamma, \prec \rangle \rangle$, where $agt \in \mathcal{A}$, $\Gamma$ is a finite set of boolean
formulas over $\mathcal{P}$, and $\prec$ is a partial-order relation over $\Gamma$ specifying the priority over the
goals. As expected, an initial goal base $\mathcal{G} = \langle \Gamma_0, \prec_0 \rangle$ consists of a set of goal formulas
$\Gamma_0$ and a partial-order priority relation $\prec_0$ over such set. We use $\prec_{agt}$ to denote the goal
priority of $agt \in \mathcal{A}_\varrho$ in goal assignment $\varrho$.

Roughly speaking, an agent always acts — by means of an applicable plan — towards 
achieving a goal with highest priority. Should that be impossible (i.e., there is no ap- 
plicable plan for such goals), the agent suspends those goals and tries to act towards 
a next preferred goal, and so on. Note that, being achievement goals, these are meant 
to be dropped whenever achieved. Moreover, goals at any level of importance could be 
achieved as (un-intended) side effects of the agent's own actions or even other agents' 
actions. 

To achieve all this, we update the definition of rational traces and strategies as
follows. A trace $\lambda^+$ is rational for an agent $agt \in \mathcal{A}$ having an initial goal base
$\mathcal{G} = \langle \Gamma_0, \prec_0 \rangle$ and a plan library $\Pi$, if for all $i < |\lambda^+|$, either (here, $g_{\mathcal{G}}(\cdot, \cdot)$ is the
corresponding goal-marking function):

- there is an applicable plan $\phi[\alpha]\psi \in \Pi$ at state $\lambda^+[i]$ for a goal $\gamma \in g_{\mathcal{G}}(\lambda^+, i)$ such
that $i \in Exec_{agt}(\phi[\alpha]\psi, g_{\mathcal{G}}, \lambda^+)$, and there is no $\gamma' \in g_{\mathcal{G}}(\lambda^+, i)$ with an applicable
plan at $\lambda^+[i]$ and such that $\gamma' \prec \gamma$; or

- there is no applicable plan at $\lambda^+[i]$ for any goal in $g_{\mathcal{G}}(\lambda^+, i)$ and $\lambda^+\langle i + 1 \rangle = noOp$.

Observe that the only difference with the original notion of rational traces is that an
agent cannot act on a lower priority goal if it has an applicable plan for a higher priority
goal. All other notions, including goal-marking functions and plan execution in traces,
remain the same. We refer to this new variant as BDI-ATLES$^P$.

Now, a model checking algorithm for BDI-ATLES$^P$ can be obtained by slightly
modifying the $Pre(\cdot,\cdot,\cdot,\cdot)$ function in the algorithm for BDI-ATLES. Instead of
considering all applicable plans, the $Pre$ function needs to take into account only applicable
plans for the most preferred goal(s). This can be easily achieved by using function
$\Delta^+(agt, q, \Pi)$ in the definition of $d^{BDI}(agt, q, \Pi)$:

$$\Delta^+(agt, q, \Pi) = \{\, \phi[\alpha]\psi \in \Delta(agt, q, \Pi) \mid (\exists \gamma \in gl(agt, q)).\ \psi \models \gamma,\ (\neg\exists \phi'[\alpha']\psi' \in \Delta(agt, q, \Pi),\ \gamma' \in gl(agt, q)).\ \gamma' \prec_{agt} \gamma \wedge \psi' \models \gamma' \,\}.$$

That is, $\Delta^+(agt, q, \Pi)$ denotes the set of all highest-priority applicable plans at $q$:
there is no plan $\phi'[\alpha']\psi'$ applicable at $q$ and serving a higher-priority goal $\gamma'$.
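
The priority filter can be sketched in Python as follows. The sketch assumes that the list of applicable plans has already been computed, that goals and plan effects are conjunctions of atoms encoded as frozensets, and that the strict priority relation is supplied as a predicate `higher(g1, g2)` meaning that `g1` has priority over `g2`. The names `delta_plus`, `Plan`, and the gardening data are hypothetical.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    context: frozenset  # phi
    action: str         # alpha
    effect: frozenset   # psi


def delta_plus(delta, goals, higher):
    """Applicable plans serving a goal that no other applicable plan outranks."""
    def serves(plan, goal):
        return goal <= plan.effect

    def unbeaten(goal):
        # No applicable plan serves a goal with strictly higher priority.
        return not any(higher(g2, goal) and serves(p2, g2)
                       for p2 in delta for g2 in goals)

    return [p for p in delta
            if any(serves(p, g) and unbeaten(g) for g in goals)]


# Gardening example from the text: plucking fruit outranks collecting dirt.
fruit = frozenset({"has_fruit"})
dirt = frozenset({"dirt_collected"})
pluck = Plan(frozenset({"at_tree"}), "pluck", fruit)
collect = Plan(frozenset({"at_dirt"}), "collect", dirt)


def higher(g1, g2):
    return (g1, g2) == (fruit, dirt)


both_applicable = delta_plus([pluck, collect], {fruit, dirt}, higher)
only_low_applicable = delta_plus([collect], {fruit, dirt}, higher)
```

When both plans are applicable only `pluck` survives; when no plan for the higher-priority goal is applicable, the agent may act on the lower-priority goal, which matches the suspension behaviour described above.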

We note that the complexity of model checking this variant with goal priorities 
remains the same as that of BDI-ATLES. Indeed, the size of the extended model is still 
exponential w.r.t. the original model. This is because the agent might achieve lower 
priority goals as non-intended side effects, and as a result, one still needs to extend the 
original states with the powerset of the initial goal base. 

Nonetheless, the BDI-ATLES$^P$ variant is more general; if agents have equal priorities
on all their goals, then it is equivalent to the original BDI-ATLES framework. On
the other hand, priorities on goals can yield very different abilities among agents and
coalitions. In fact, the ability of an agent (or coalition) may depend on the priorities of
other agents: a gold miner agent may be able to pick gold pieces only if the opponent
prefers exploring the grid to picking gold.

Lastly, it is not difficult to provide different semantics to goal priorities. One can 
imagine agents dropping goals or blindly waiting for a plan to become applicable when 
none is available, rather than suspending the goal and acting on lower priority goals. 
While it is straightforward to capture such semantics, the exponential complexity for 
model checking still remains. 

B.2 Maintenance goals 

In order to force goals to "re-appear" when violated, we re-define the goal-marking
function $g_{\mathcal{G}}(\cdot, \cdot)$, which determines the goal base of an agent at a given $i$-th moment in
a trace $\lambda^+$, as follows:

$$g_{\mathcal{G}}(\lambda^+, i) = \{\gamma \mid \gamma \in \mathcal{G},\ \mathcal{V}(\lambda^+[i]) \not\models \gamma\}.$$

Importantly, the rest of the definitions for plan execution and rational strategies remain
exactly the same. We shall refer to this alternate version of our logic as BDI-ATLES$^M$.$^4$
In terms of the model checking algorithm, the only change required is in the definition
of a goal-extended model $\mathcal{M}_\varrho$, which is defined as above except for the following:

- The set of extended states $Q_\varrho$ is defined as follows:

$$Q_\varrho = \{\, \langle q, g_1, \ldots, g_{|\mathcal{A}_\varrho|} \rangle \mid q \in Q,\ g_i = \varrho[i] \setminus \{\gamma \mid \mathcal{V}(q) \models \gamma\} \,\};$$

- $\sigma_\varrho(q_\varrho, \vec{a}) = q'_\varrho$ iff $\sigma(ws(q_\varrho), \vec{a}) = ws(q'_\varrho)$.

It is not difficult to see now that the set of extended states is no longer exponential 
w.r.t. the original set of states. 
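
A short Python sketch shows why the blow-up disappears: under the maintenance reading, the goal component of an extended state is fully determined by the world state. The encoding (states as frozensets of true atoms, goals as conjunctions of atoms, a dictionary from agents to goal sets) and the fuel example are assumptions made for illustration.

```python
def maintenance_extended_states(facts, goal_assignment):
    """One extended state per world state: (q, violated goals of each agent).

    facts maps a world state to the frozenset of atoms true in it;
    goal_assignment maps each BDI agent to its set of maintenance goals.
    """
    agents = sorted(goal_assignment)
    return {
        (q, tuple(frozenset(g for g in goal_assignment[a] if not g <= facts[q])
                  for a in agents))
        for q in facts
    }


# Hypothetical Mars robot: the goal is to keep the fuel level above a threshold.
fuel_ok = frozenset({"fuel_ok"})
facts = {"full": frozenset({"fuel_ok"}), "low": frozenset()}
extended = maintenance_extended_states(facts, {"robot": {fuel_ok}})
```

The extended model has exactly as many states as the original one: the goal is active in `low`, where it is violated, and inactive in `full`.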

4 Note that we do not allow achievement and maintenance goals to co-exist in the current setting.



